Inverse Reinforcement Learning for Marketing
Learning customer preferences from observed behaviour is an important
topic in the marketing literature. Structural models typically model
forward-looking customers or firms as utility-maximizing agents whose utility
is estimated using methods of Stochastic Optimal Control. We suggest an
alternative approach to study dynamic consumer demand, based on Inverse
Reinforcement Learning (IRL). We develop a version of the Maximum Entropy IRL
that leads to a highly tractable model formulation that amounts to
low-dimensional convex optimization in the search for optimal model parameters.
Using simulations of consumer demand, we show that observational noise for
identical customers can be easily confused with an apparent consumer
heterogeneity.
Comment: 18 pages, 5 figures
Quantum KAM Technique and Yang-Mills Quantum Mechanics
We study a quantum analogue of the iterative perturbation theory by
Kolmogorov used in the proof of the Kolmogorov-Arnold-Moser (KAM) theorem. The
method is based on successive canonical transformations with a "running" coupling
constant $\lambda, \lambda^{2}, \lambda^{4}$, etc. The proposed scheme, like its classical
predecessor, is "superconvergent" in the sense that after the n-th step, a
theory is solved to the accuracy of order $\lambda^{2^{n-1}}$. It is shown that
the Kolmogorov technique corresponds to an infinite resummation of the usual
perturbative series. The corresponding expansion is convergent for the quantum
anharmonic oscillator due to the fact that it turns out to be identical to the
Pade series. The method is easily generalizable to many-dimensional cases. The
Kolmogorov technique is further applied to a non-perturbative treatment of
Yang-Mills quantum mechanics. A controllable expansion for the wave function
near the origin is constructed. For large fields, we build an asymptotic
adiabatic expansion in inverse powers of the field. This asymptotic solution
contains arbitrary constants which are not fixed by the boundary conditions at
infinity. To find them, we approximately match the two expansions in an
intermediate region. We also discuss some analogies between this problem and
the method of QCD sum rules.
Comment: 26 pages, LaTeX, no figures
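To make the superconvergence claim concrete, the accuracy quoted in the abstract can be tabulated for the first few steps. This only evaluates the stated order $\lambda^{2^{n-1}}$ for n = 1 to 4; it is an illustration of the exponent, not a result taken from the paper.

```latex
\begin{array}{c|cccc}
n & 1 & 2 & 3 & 4 \\ \hline
\text{accuracy after step } n:\ \lambda^{2^{n-1}} & \lambda & \lambda^{2} & \lambda^{4} & \lambda^{8}
\end{array}
```

The exponent doubles at every step, which matches the "running" coupling sequence quoted in the abstract.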
QLBS: Q-Learner in the Black-Scholes(-Merton) Worlds
This paper presents a discrete-time option pricing model that is rooted in
Reinforcement Learning (RL), and more specifically in the famous Q-Learning
method of RL. We construct a risk-adjusted Markov Decision Process for a
discrete-time version of the classical Black-Scholes-Merton (BSM) model, where
the option price is an optimal Q-function, while the optimal hedge is a second
argument of this optimal Q-function, so that both the price and hedge are parts
of the same formula. Pricing is done by learning to dynamically optimize
risk-adjusted returns for an option replicating portfolio, as in the Markowitz
portfolio theory. Using Q-Learning and related methods, once created in a
parametric setting, the model is able to go model-free and learn to price and
hedge an option directly from data, and without an explicit model of the world.
This suggests that RL may provide efficient data-driven and model-free methods
for optimal pricing and hedging of options, once we depart from the academic
continuous-time limit, and vice versa, option pricing methods developed in
Mathematical Finance may be viewed as special cases of model-based
Reinforcement Learning. Further, due to simplicity and tractability of our
model which only needs basic linear algebra (plus Monte Carlo simulation, if we
work with synthetic data), and its close relation to the original BSM model, we
suggest that our model could be used for benchmarking of different RL
algorithms for financial trading applications.
Comment: 30 pages (minor changes in the presentation, updated references)
Implied Multi-Factor Model for Bespoke CDO Tranches and other Portfolio Credit Derivatives
This paper introduces a new semi-parametric approach to the pricing and risk
management of bespoke CDO tranches, with particular attention to bespokes
that need to be mapped onto more than one reference portfolio. The only user
input in our framework is a multi-factor model (a "prior" model hereafter) for
index portfolios, such as CDX.NA.IG or iTraxx Europe, that are chosen as
benchmark securities for the pricing of a given bespoke CDO. Parameters of the
prior model are fixed, and not tuned to match prices of benchmark index
tranches. Instead, our calibration procedure amounts to a proper reweighting
of the prior measure using the Minimum Cross Entropy method. As the latter
problem reduces to convex optimization in a low dimensional space, our model is
computationally efficient. Both the static (one-period) and dynamic versions of
the model are presented. The latter can be used for pricing and risk management
of more exotic instruments referencing bespoke portfolios, such as
forward-starting tranches or tranche options, and for calculation of credit
valuation adjustment (CVA) for bespoke tranches.
Comment: 40 pages, 10 figures
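The reweighting step described above can be sketched generically. The following is a minimal illustration of Minimum Cross Entropy reweighting on a discrete set of scenarios, not the paper's implementation: the function name `mce_reweight`, the scenario representation, and the use of undamped Newton iterations on the dual are all assumptions made for this example. It shows why the problem is a low-dimensional convex optimization: the only unknowns are one Lagrange multiplier per benchmark constraint.

```python
import numpy as np

def mce_reweight(prior, payoffs, targets, n_iter=50, tol=1e-10):
    """Reweight a discrete prior so that expected payoffs match targets.

    Minimizes KL(q || prior) subject to q @ payoffs == targets by Newton
    iterations on the convex dual in the Lagrange multipliers `lam`.
    Hypothetical helper for illustration; assumes strictly positive prior
    weights and targets that are attainable on the scenario set.
    """
    prior = np.asarray(prior, dtype=float)
    payoffs = np.asarray(payoffs, dtype=float).reshape(len(prior), -1)
    targets = np.atleast_1d(np.asarray(targets, dtype=float))
    lam = np.zeros(payoffs.shape[1])

    def tilt(lam):
        # The minimizer has the exponential-tilting form
        # q_i proportional to prior_i * exp(lam . payoffs_i).
        logq = np.log(prior) + payoffs @ lam
        q = np.exp(logq - logq.max())  # subtract max for numerical stability
        return q / q.sum()

    for _ in range(n_iter):
        q = tilt(lam)
        mean = q @ payoffs
        grad = mean - targets  # gradient of the dual objective
        if np.max(np.abs(grad)) < tol:
            break
        centered = payoffs - mean
        hess = centered.T @ (centered * q[:, None])  # covariance under q
        lam = lam - np.linalg.solve(hess, grad)  # undamped Newton step
    return tilt(lam), lam
```

For example, calling `mce_reweight(np.full(5, 0.2), np.arange(5.0), 1.5)` tilts a uniform prior over five loss levels so that the expected loss becomes 1.5. A production version would likely add a line search, since undamped Newton steps can overshoot when the targets are far from the prior expectations.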
Keep It Real: Tail Probabilities of Compound Heavy-Tailed Distributions
We propose an analytical approach to the computation of tail probabilities of
compound distributions whose individual components have heavy tails. Our
approach is based on the contour integration method, and gives rise to a
representation of the tail probability of a compound distribution in the form
of a rapidly convergent one-dimensional integral involving a discontinuity of
the imaginary part of its moment generating function across a branch cut. The
latter integral can be evaluated in quadratures, or alternatively represented
as an asymptotic expansion. Our approach thus offers a viable (especially at
high percentile levels) alternative to more standard methods such as Monte
Carlo or the Fast Fourier Transform, traditionally used for such problems. As a
practical application, we use our method to compute the operational Value at
Risk (VaR) of a financial institution, where individual losses are modeled as
spliced distributions whose large loss components are given by power-law or
lognormal distributions. Finally, we briefly discuss extensions of the present
formalism for calculation of tail probabilities of compound distributions made
of compound distributions with heavy tails.
Comment: 23 pages, 3 figures
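For context, the Monte Carlo baseline that the abstract contrasts with can be sketched as follows. This is not the contour integration method of the paper. The Poisson frequency, the lognormal severity, the function name `compound_tail_mc`, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def compound_tail_mc(x, freq_mean, mu, sigma, n_sims=200_000, seed=0):
    """Monte Carlo estimate of P(S > x) for S = X_1 + ... + X_N.

    Assumptions for illustration: N ~ Poisson(freq_mean) and
    X_k ~ Lognormal(mu, sigma), all independent.
    """
    rng = np.random.default_rng(seed)
    counts = rng.poisson(freq_mean, size=n_sims)
    severities = rng.lognormal(mu, sigma, size=counts.sum())
    # Label each severity with the index of the scenario it belongs to,
    # then sum severities per scenario (scenarios with N = 0 get total 0).
    scenario = np.repeat(np.arange(n_sims), counts)
    totals = np.bincount(scenario, weights=severities, minlength=n_sims)
    return float(np.mean(totals > x))
```

The estimate is a sample frequency, so at high percentile levels only a handful of simulated scenarios exceed the threshold and the relative error grows. That is the regime where the abstract suggests an analytical representation is most useful.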
The QLBS Q-Learner Goes NuQLear: Fitted Q Iteration, Inverse RL, and Option Portfolios
The QLBS model is a discrete-time option hedging and pricing model that is
based on Dynamic Programming (DP) and Reinforcement Learning (RL). It combines
the famous Q-Learning method for RL with the Black-Scholes (-Merton) model's
idea of reducing the problem of option pricing and hedging to the problem of
optimal rebalancing of a dynamic replicating portfolio for the option, which is
made of a stock and cash. Here we expand on several NuQLear (Numerical
Q-Learning) topics with the QLBS model. First, we investigate the performance
of Fitted Q Iteration for an RL (data-driven) solution to the model, and
benchmark it versus a DP (model-based) solution, as well as versus the BSM
model. Second, we develop an Inverse Reinforcement Learning (IRL) setting for
the model, where we only observe prices and actions (re-hedges) taken by a
trader, but not rewards. Third, we outline how the QLBS model can be used for
pricing portfolios of options, rather than a single option in isolation, thus
providing its own, data-driven and model independent solution to the (in)famous
volatility smile problem of the Black-Scholes model.
Comment: 18 pages, 5 figures
Bayesian Entropic Inverse Theory Approach to Implied Option Pricing with Noisy Data
A popular approach to nonparametric option pricing is the Minimum Cross
Entropy (MCE) method based on minimization of the relative Kullback-Leibler
entropy of the price density distribution and a given reference density, with
observable option prices serving as constraints. When market prices are noisy,
the MCE method tends to overfit the data and often becomes unstable. We propose
a non-parametric option pricing method whose inputs are noisy market prices of
an arbitrary number of European options with arbitrary maturities. Implied
transition densities are calculated using the Bayesian inverse theory with
entropic priors, with a reference density which may be estimated by the
algorithm itself. In the limit of zero noise, our approach is shown to reduce
to the canonical MCE method generalized to a multi-period case. The method can
be used for a non-parametric pricing of American/Bermudan options with a
possible weak path dependence.
Comment: 23 pages, 6 figures
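The canonical single-period MCE problem referred to above can be written in its standard textbook form. The notation is assumed for this sketch and is not taken from the paper: $p$ is the reference density, $q$ the implied density, $C_i(x)$ the discounted payoff of the $i$-th option, $c_i$ its observed price, and $\lambda_i$ the Lagrange multipliers.

```latex
\begin{aligned}
\min_{q}\;& \int q(x)\,\ln\frac{q(x)}{p(x)}\,dx \\
\text{s.t.}\;& \int q(x)\,dx = 1, \qquad \int q(x)\,C_i(x)\,dx = c_i, \quad i = 1,\dots,K, \\
\Rightarrow\;& q(x) = \frac{p(x)\,\exp\!\big(\sum_{i} \lambda_i C_i(x)\big)}{\int p(y)\,\exp\!\big(\sum_{i} \lambda_i C_i(y)\big)\,dy}.
\end{aligned}
```

Because the price constraints are imposed exactly, any noise in the $c_i$ is fitted exactly as well, which is consistent with the overfitting and instability the abstract attributes to MCE under noisy prices.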
"Integrating in" and Effective Lagrangian for Non-Supersymmetric Yang-Mills Theory
Recently a non-supersymmetric analog of the Veneziano-Yankielowicz (VY) effective
Lagrangian has been proposed and applied for the analysis of the theta
dependence in pure Yang-Mills theory. This effective Lagrangian is similar in
many respects to the VY construction and, in particular, exhibits a kind of low
energy holomorphy which is absent in the full YM theory. Here we incorporate a
heavy fermion into this effective theory by using the "integrating in"
technique. We find that, in terms of this extended theory, holomorphy of the
effective Lagrangian for pure YM theory naturally implies a holomorphic
dependence on the heavy fermion mass.
It is shown that this analysis fixes, under certain assumptions, a
dimensionless parameter which enters the effective Lagrangian and determines
the number of nondegenerate vacuum sectors in pure YM theory. We also compare
our results for the vacuum structure and theta dependence to those obtained
recently by Witten on the basis of the AdS/CFT correspondence.
Comment: LaTeX, 17 pages, no figures. Discussion is extended and new references
are added. Final version to appear in Nucl. Phys.
BSLP: Markovian Bivariate Spread-Loss Model for Portfolio Credit Derivatives
BSLP is a two-dimensional dynamic model of interacting portfolio-level loss
and spread (more exactly, loss intensity) processes. The model is similar to
the top-down HJM-like frameworks developed by Schönbucher (2005) and
Sidenius-Piterbarg-Andersen (SPA) (2005), but is constructed as a
Markovian, short-rate intensity model. This property of the model enables fast
lattice methods for pricing various portfolio credit derivatives such as
tranche options, forward-starting tranches, leveraged super-senior tranches
etc. A non-parametric model specification is used to achieve nearly perfect
calibration to liquid tranche quotes across strikes and maturities. A
non-dynamic version of the model obtained in the zero volatility limit of
stochastic intensity is useful on its own as an arbitrage-free interpolation
model to price non-standard index tranches off the standard ones.
Comment: 42 pages, 9 figures
Climbing Down from the Top: Single Name Dynamics in Credit Top Down Models
In the top-down approach to multi-name credit modeling, calculation of single
name sensitivities appears possible, at least in principle, within the
so-called random thinning (RT) procedure which dissects the portfolio risk into
individual contributions. We make an attempt to construct a practical RT
framework that enables efficient calculation of single name sensitivities in a
top-down framework, and can be extended to valuation and risk management of
bespoke tranches. Furthermore, we propose a dynamic extension of the RT method
that enables modeling of both idiosyncratic and default-contingent individual
spread dynamics within a Monte Carlo setting in a way that preserves the
portfolio "top"-level dynamics. This results in a model that is not only
calibrated to tranche and single name spreads, but can also be tuned to
approximately match given levels of spread volatilities and correlations of
names in the portfolio.
Comment: 34 pages, 9 figures